METHOD OF PRODUCING HUMAN GROWTH FACTORS FROM 
WHOLE PLANTS OR PLANT CELL CULTURES 



FIELD OF THE INVENTION 

The present invention relates generally to a method for producing human 
growth factors from whole plants or plant cell culture. More specifically, the 
invention relates to producing a human growth factor from a plant cell encoded to 
produce the human growth factor with a length of at least 200 amino acids from 
transgenic plant cells. 

BACKGROUND OF THE INVENTION 

Growth factors and monoclonal antibodies (Mabs) are diverse yet highly 
specialized types of proteins having research and commercial applications in areas 
of therapeutics and diagnostics. 

Therapeutic uses of human epidermal growth factor (hEGF) include treatment 
of soft tissue wounds (U.S. 5,218,093, 1993), specifically including skin and eye 
injuries as well as corneal and stomach ulcers (Frost and Sullivan 1996, 1994). In 
addition, several hEGF-bearing fusion constructs have been considered and/or 
tested, including mitotoxins for treatment of restenosis (Frost and Sullivan, 1994) 
and radioconjugates for a variety of anti-neoplastic therapies (Grieg et al., 1988). 

Current production techniques for these proteins, such as hybridoma and other 
types of mammalian cell culture methods (Köhler and Milstein, 1975), are generally 
slow, labor intensive, and consequently expensive. In addition, current 
production techniques are difficult to validate due to the pathogenic and oncogenic 
potential of cultivated mammalian tissue. 

Multimers of from 2 to 7 EGF units, each having 53 amino acid residues, 
have been produced from bacterial hosts, e.g. E. coli, Streptomyces and Bacillus, 
fungal hosts, e.g. Saccharomyces, Pichia and Aspergillus, insect cell hosts, and 
mammalian cell hosts, e.g. CHO cells and COS cells (U.S. Patent No. 5218093, 
1993). hEGF production in Staphylococcus aureus (U.S. Patent No. 5004686, 
1991) is by a fusion construct encoding hEGF linked to a protein. Synthesis 
methods using transgenic bacterial strains have problems such as faulty antibody 
gene expression, protein folding difficulties, inability to glycosylate proteins, and 
relegation of foreign peptides to insoluble material accumulated in inclusion 
bodies. 

Transgenic plants can be used for the production of high value, medicinally 
important proteins, for example, production of Mabs (Hiatt et al., 1989; During et 
al., 1990; Benvenuto et al. 1991, Firek et al. 1993, Gao et al. 1993), human 
growth hormone (Kay et al. 1986) and human serum albumin (Sijmons et al. 
1990). Transformed cells synthesize, secrete, and accumulate functional 
antibodies including single (Benvenuto et al. 1991) and double (During et al. 1990, 
Hiatt et al. 1991) domain immunoglobulins. However, it is noted that none of 
these authors investigated production of any human growth factor from transgenic 
plants. 

Plant cell culture media are well-defined and inexpensive compared to 
mammalian cell culture media. Further, plant cell products, unlike 
mammalian-derived protein formulations, are generally assumed to be neither 
pathogenic nor oncogenic to humans (Crawford, 1995). Also, when compared to 
similar production in transgenic bacterial strains (Attaai and Shuler 1987), plant 
tissue culture methods showed greater stability of foreign gene expression, even 
without use of selection pressure (Gao et al. 1991). Higo et al. (1993) 
produced a human growth factor, specifically hEGF, in transgenic tobacco with a 
cDNA fragment size of 180 bp. Unsatisfactory foreign peptide levels of 20 to 60 
pg/mg (ppb) total soluble leaf protein were obtained. This is despite the fact that 
plant progeny appeared to produce high levels of hEGF mRNA. Exact reasons for 
the low observed levels of hEGF production are unclear. However, no signal peptide 
was encoded upstream of hEGF cDNA, which could cause the foreign protein to be 
relegated to the cytosol. Within this cell fraction, hEGF suffers proteolytic attack, 
especially considering the relatively small size (53 amino acids) of the peptide. 

Although advantages have been observed for deriving proteins including EGF 
from plants, no transgenic plant cell culture process has been commercially 
developed for production of human growth factor. The lack of commercial 
exploitation of plant derived proteins is due in part to existing technological 
hurdles as observed by Higo et al. In addition, Ma et al. (1995) reported Mab 
titers of up to 500 µg/g (ppm) fresh weight of plant material (or 300 mg/L on a 
cell culture basis), whereas comparable mammalian cell processes are reported to 
attain levels of 1-2 g/L and higher (Rosenberg, personal communication, 1995). 
Implementation of alternative production systems to mammalian and bacterial 
culture, such as plant cellular techniques, has been further limited by 
non-technological factors, such as industry and regulatory acceptance (Simonsen and 
McGrogan, 1994), because of the investment made in developing and validating the 
more established non-plant methods. 

Accordingly, there is a continuing need for plant based production of human 
growth factors. 

Background References 

18. Ma JKC, Hiatt A, Hein M, Vine N, Wang F, Stabila P, van Dolleweerd C, 
Mostov K, Lehner T. 1995. Generation and Assembly of Secretory 
Antibodies in Plants. Science 268:716-719. 

19. Simonsen CC, McGrogan M. 1994. The molecular biology of production 
cell lines. Biologicals 22:85-94. 

SUMMARY OF THE INVENTION 

Despite the hurdles in technology development and commercialization, 
economic analysis indicates that regulatory costs associated with plant cell culture 
may be reduced by as much as $70,000 per batch as compared to analogous 
mammalian cell processes (Crawford, 1995). In addition, direct production costs 
for whole plant processes at equal protein production rates appear to be two to 
four orders of magnitude lower than comparable mammalian cell processes 
(Agracetus 1995). Additionally, as plant cell titers increase, this type of 
production becomes even more capital cost-effective. 

It is, therefore, an object of the present invention to provide whole plant and 
plant cell culture derived human growth factors at higher overall concentrations 
and production rates, comparable to mammalian host cell systems. 

It is a further object of the present invention to synthesize specific human 
growth factors. 

It is another object of the present invention to increase production rates and 
concentrations by increasing protein stability through the use of fusion constructs. 

It is a further object of the present invention to use Nicotiana tabacum 
(tobacco) and Solanum tuberosum (potato) whole plants and highly synchronous 
suspensions. 

According to the present invention, the production of human growth factors 
is achieved in whole plants or plant cell culture wherein the human growth factor 
is produced with a length of at least 200 amino acids. For epidermal growth 
factor this would comprise at least a tetramer of EGF units. 
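
The tetramer figure follows from simple arithmetic on the 53-residue EGF unit. The following is a minimal sketch, assuming the repeats are joined directly with no additional linker residues; the function name is illustrative only and is not part of the specification.

```python
import math

EGF_UNIT_LENGTH = 53      # residues per mature hEGF unit (from the text)
MIN_PRODUCT_LENGTH = 200  # minimum product length stated in the text


def min_repeats(unit_length: int, min_length: int) -> int:
    """Smallest number of tandem units whose combined length reaches min_length."""
    return math.ceil(min_length / unit_length)


repeats = min_repeats(EGF_UNIT_LENGTH, MIN_PRODUCT_LENGTH)
total_length = repeats * EGF_UNIT_LENGTH
print(repeats, total_length)  # 4 212
```

A trimer would give only 159 residues, which is why four units is the smallest concatemer that satisfies the 200 amino acid threshold.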
Modifying chimeric cDNA and subcloning into a plant expression vector are 
done using standard molecular cloning procedures (Ausubel et al. 1992) and 
splicing PCR techniques (Marks et al. 1992). 

Effectiveness or production of the translation process has been increased 
according to the present invention by (1) cloning of pre-pro-EGF cDNA of 
approximately 4.5 kb into both whole plants and cell culture to increase overall 
titers of active hEGF, (2) synthesizing cDNA and transforming plants and cell 
culture for production of an oligomeric polypeptide consisting of repeated hEGF 
domains, and (3) increasing the overall size of the gene to be expressed with a 
fusion construct encoding hEGF linked to a protein that is efficiently produced in 
plant systems. As needed, synthetic cDNA includes plant-specific proteolytic 
cleavage sites between EGF repeats to facilitate correct processing in planta. 
Appropriate proteolytic cleavage sites upstream and downstream of hEGF are 
added if needed to obtain the final product. 

The subject matter of the present invention is particularly pointed out and 
distinctly claimed in the concluding portion of this specification. However, both 
the organization and method of operation, together with further advantages and 
objects thereof, may best be understood by reference to the following description 
taken in connection with accompanying drawings wherein like reference characters 
refer to like elements. 

BRIEF DESCRIPTION OF THE DRAWINGS 



FIG. 1 provides the size of EGF precursor (pre-pro-EGF) relative to 
correctly processed EGF. 

FIG. 2 depicts schematically the construction of pZD203, a vector used to 
modify the restriction sites on pre-pro-EGF to develop cDNA suitable for cloning 
into the plant expression vector pGA643. 

FIG. 3 depicts schematically the construction of pZD204, the plant 
expression vector carrying pre-pro-EGF. 

FIG. 4 shows EGF levels seen in individual calli resulting from positive 
transformation and antibiotic selection. EGF concentrations were determined 
using enzyme-linked immunosorbent assay and are based on a 30 kDa protein size. 

DESCRIPTION OF THE PREFERRED EMBODIMENT(S) 



The present invention is a method for production of human growth factors 
using whole plants as well as plant cell suspensions transformed with appropriately 
constructed vector plasmids, wherein the human growth factor is produced with a 
length of at least 200 amino acids. More specifically, the method of the present 
invention is stable expression of human growth factors of interest as direct 
therapeutics, targeted delivery systems and research reagents. Human growth 
factors produced include human epidermal growth factor (hEGF), transforming 
growth factor (TGF), vascular endothelial growth factor (VEGF), platelet-derived 
growth factor (PDGF), fibroblast growth factor (FGF), tumor necrosis factor 
(TNF), heparin-binding epidermal growth factor (HBEGF), insulin-like growth 
factor (ILGF), platelet-derived endothelial cell growth factor (PDECGF), 
platelet-derived angiogenesis factor (PDAF), and bone-and-cartilage inducing growth 
factor (BCIF). 

Any plant from the plant kingdom may be utilized. Specific types of plants 
that are amenable to the transformation steps listed herein include, but are not 
limited to, monocotyledonous, dicotyledonous, and tuberous plants. Preferred 
species include but are not limited to Nicotiana tabacum (tobacco), Solanum 
tuberosum (potato), Glycine max (soybean), and Zea mays (corn). 

The method of the present invention, a method of producing human growth 
factors from plant cells, has the steps of: 

(a) obtaining a positive transformant of the plant cells, the positive 
transformant carrying genetic material encoding the production of a human growth 
factor with a length of at least 200 amino acids; 

(b) cultivating the positive transformant; and 

(c) obtaining the human growth factors. 
The step of obtaining the positive transformant may be as simple as purchasing 
it or as complex as making it by well known methods, for example direct particle 
bombardment as described in Gene Transfer by Particle Bombardment, Klein TM, 
Knowlton S, Arentzen R, Plant Tissue Culture Manual, D1, pp 1-12, 1991, Kluwer 
Academic Publishers, or by Agrobacterium mediated transformation as described 
in Hoekema et al. 1985 (Hoekema KM, Hirsch PR, Hooykaas PJJ, Schilperoort RA, 
1985, Nononcogenic Plant Vectors for Use in the Agrobacterium Binary System, 
Plant Molecular Biology, Vol. 5, 85-89), and further described herein. 

The step of cultivating involves either whole plant cultivating or tissue 
cultivating by any of the well known cultivating methods. 

The step of obtaining the human growth factors is by well known separation and 
purification steps, for example ultrafiltration, affinity chromatography, and/or 
electrophoresis. 

An Agrobacterium mediated transformation method of the present invention 
has the steps of: 

(a) modifying chimeric cDNA encoding a specific growth factor for 
subcloning into a plant expression vector; 

(b) subcloning the chimeric cDNA into the plant expression vector; 

(c) transferring the plant expression vector containing transgenic plant 
cells to an agrobacterium; 

(d) co-cultivating a portion of the transgenic plant cells (suspension 
culture or leaf disks) with the agrobacterium; 

(e) selecting positive transformants from the co-cultivated culture on 
an antibiotic selective media; 

(f) permitting growth of the transgenic plant cells in whole plants or 
suspensions; and 

(g) extracting a liquid containing the human growth factor; 

wherein the improvement comprises: 

said human growth factor having a length of at least 200 amino acids. 
Modifying chimeric cDNA and subcloning into a plant expression vector are 
done using standard molecular cloning procedures (Ausubel et al. 1992) and 
splicing PCR techniques (Marks et al. 1992). More specifically, modifying 
chimeric cDNA has the steps of: 

(a) adding a transcription promoter to the upstream or 5' end of 
the chimeric cDNA; and 

(b) adding a transcription terminator to the downstream or 3' end 
of the chimeric cDNA. The transcription promoter and the transcription 
terminator are regulatory elements. 

Further, an additional regulatory element encoding a signal peptide may be added 
between the transcription promoter and the 5' end of the chimeric cDNA in order 
to relegate the product human growth factor to a specific cellular organelle. In 
addition, other regulatory elements may be added either between the promoter and 
the additional regulatory element encoding the signal peptide or at the 3' end of 
the chimeric cDNA to obtain greater mRNA stability between transcription and 
translation events. 
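
The 5' to 3' ordering of elements described above can be sketched as a simple ordered list. This is an illustration only: the function and the element labels are hypothetical placeholders, the optional leader is shown in the position between the promoter and the signal peptide, and no sequence data is implied.

```python
from typing import List, Optional


def cassette_order(cdna: str, promoter: str, terminator: str,
                   signal_peptide: Optional[str] = None,
                   leader: Optional[str] = None) -> List[str]:
    """Return element labels in 5' to 3' order as described in the text."""
    elements = [promoter]                 # promoter at the 5' end
    if leader is not None:
        elements.append(leader)           # between promoter and signal peptide
    if signal_peptide is not None:
        elements.append(signal_peptide)   # between promoter and cDNA 5' end
    elements.append(cdna)
    elements.append(terminator)           # terminator at the 3' end
    return elements


# Labels below are placeholders chosen for illustration.
print(cassette_order("chimeric cDNA", "promoter", "terminator",
                     signal_peptide="signal peptide", leader="leader"))
```

Omitting the optional arguments yields the minimal promoter, cDNA, terminator arrangement of steps (a) and (b).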

In either whole plants or cell cultures, to enhance expression of the chimeric 
gene (hEGF), the present invention further includes manipulation of a 35S 
promoter by duplication of the upstream region (-343 to -90 bp) of the CaMV 35S 
promoter to increase transcription activity, as well as use of TSC29 and TSC40 
promoters. These promoters and their transcription activity have been reported by 
Gao et al. 1994, and Dai et al. 1995. 

In whole plants, transcription promoters may include the upstream enhancer 
(nucleotides -343 to -90 relative to the transcription start site) of the CaMV 35S 
promoter (Benfey et al. 1989) or the chlorophyll a/b binding protein (cab J) 
promoter (Ha and An 1988). Use of these types of regulatory elements confers 
human growth factor production characteristics on traditionally non-salable 
portions of crop plants, such as the leafy tops of potatoes. Use of potato tops, for 
example, under post-harvest conditions, results in overexpression and production 
of human growth factor in non-salable plant portions towards the end of the 
harvesting season, without affecting crop quality. 

WO 98/21348    PCT/US97/20603 

Transferring the plant expression vector into the agrobacterium is completed 
using the freeze-thaw method (An 1987). For monocotyledonous species, 
super-binary vectors, such as pTOK233 and pSB131, are used to achieve high 
transformation frequency (Ishida et al. 1996). Remaining co-cultivation, selection, 
growth, and extraction steps (d through g) have been described by Magnusen et al. 
(1996) and are well known in the art of plant molecular biology. 

Many human growth factors possess relatively short lengths of between 50 
and 100 amino acids. For example, hEGF has a length of 53 amino acids. 
Accordingly, obtaining a larger construct of at least 200 amino acids requires 
either (1) cloning the larger precursor cDNA, (2) synthesizing a concatemer 
consisting of multiple gene copies encoding the growth factor, or (3) increasing the 
overall size of a gene to be expressed using a fusion construct encoding a growth 
factor linked to a protein that is efficiently produced in plant systems. 

An example of obtaining a larger precursor to increase the overall protein 
size is the cDNA encoding pre-pro-EGF. This particular gene, at approximately 
4.5 kb, encodes a 1207 amino acid protein that, in vivo, is proteolytically cleaved 
to yield 53 amino acid EGF. In plant systems, this larger protein will provide 
additional stability against proteolytic degradation. 
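
The relationship between the 1207-residue precursor and the approximately 4.5 kb cDNA can be checked with codon arithmetic. The sketch below assumes three nucleotides per residue plus one stop codon and treats 4500 nt as a nominal cDNA length; the remainder is untranslated sequence. Names are illustrative only.

```python
CODON_LENGTH = 3          # nucleotides per codon
PRECURSOR_RESIDUES = 1207 # pre-pro-EGF length from the text
MATURE_RESIDUES = 53      # mature EGF length from the text
NOMINAL_CDNA_NT = 4500    # "approximately 4.5 kb" (assumed nominal value)


def coding_nucleotides(n_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode n_residues, optionally with a stop codon."""
    return CODON_LENGTH * (n_residues + (1 if include_stop else 0))


orf_nt = coding_nucleotides(PRECURSOR_RESIDUES)
untranslated_nt = NOMINAL_CDNA_NT - orf_nt
mature_fraction = MATURE_RESIDUES / PRECURSOR_RESIDUES

print(orf_nt)                     # 3624
print(untranslated_nt)            # 876
print(round(mature_fraction, 3))  # 0.044
```

On these assumptions, mature EGF accounts for roughly 4% of the precursor, so most of the expressed polypeptide serves as flanking sequence.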

Synthesizing the cDNA concatemer is preferably done by ligating multiple 
gene copies using peptide linkers to obtain a processed protein length of at least 
200 amino acids. The multiple gene copies preferably encode an oligomeric 
polypeptide having repeated growth factor domains. Peptide linkers may 
be used that are (1) proteolytically cleaved in planta, (2) proteolytically cleaved in 
a separate enzymatic treatment step, or (3) resistant to proteolytic cleavage. 
Peptide linkers that are proteolytically cleaved by serine proteases in planta 
preferably possess the amino acid sequence Arg-Asn. This sequence already exists 
when EGF is concatemerized since the C-terminal amino acid is arginine and the 
N-terminal amino acid is asparagine. To achieve in planta cleavage, the processed 
protein is targeted either to the cell cytosol (no signal peptide) or vacuole 
(phytohemagglutinin signal peptide [Chrispeels et al. 1991]). To achieve 
proteolytic cleavage in a separate enzymatic treatment step, the same amino acid 
sequence is preferably used (Arg-Asn) and the growth factor concatemer is either 
targeted to the chloroplast (pea photosystem II signal peptide) or secreted (PR-II 
signal peptide) to limit proteolytic degradation. To achieve resistance to 
proteolytic cleavage, linkers would preferably possess the amino acid sequence 
Arg-Pro. This sequence is resistant to serine proteases. Specifically for EGF, 
linkage would preferably be achieved by synthesizing cDNA encoding a single 
proline unit between growth factor monomer cDNAs. 
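
The junction chemistry described above can be illustrated at the amino acid level. The sketch below assumes the standard 53-residue mature hEGF sequence in one-letter code (taken from general references and consistent with the codons of the sequence listing, not quoted from the text) and uses hypothetical helper names. It shows that direct concatenation yields Arg-Asn junctions, while a single proline between monomers yields Arg-Pro junctions.

```python
# Mature hEGF, 53 residues, one-letter code (assumed from standard references).
HEGF = "NSDSECPLSHDGYCLHDGVCMYIEALDKYACNCVVGYIGERCQYRDLKWWELR"


def concatemer(monomer: str, copies: int, linker: str = "") -> str:
    """Join `copies` monomers head to tail with `linker` between neighbours."""
    return linker.join([monomer] * copies)


def junction(monomer: str, linker: str = "") -> str:
    """Dipeptide formed by the last monomer residue and the residue after it."""
    following = (linker + monomer)[0]
    return monomer[-1] + following


direct = concatemer(HEGF, 4)         # no linker: cleavable Arg-Asn junctions
blocked = concatemer(HEGF, 4, "P")   # one proline: resistant Arg-Pro junctions

print(len(direct), junction(HEGF), direct.count("RN"))     # 212 RN 3
print(len(blocked), junction(HEGF, "P"), blocked.count("RP"), blocked.count("RN"))  # 215 RP 3 0
```

Both tetramers exceed the 200 amino acid threshold; the proline variant is three residues longer and contains no Arg-Asn dipeptide.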

Increasing the overall size of a gene may be done by ligating EGF with 
cDNA encoding a protective protein to protect from proteolytic cleavage, thereby 
forming a fusion construct. Protective proteins include but are not limited to 
streptococcal protein G or β-galactosidase, both of which have been shown to inhibit 
proteolysis when attached to the C-terminus of other foreign proteins (Hellebust et 
al. 1989). Gene size could also be increased by ligating EGF with cDNA 
encoding another protective protein of commercial interest that processes well in 
plant-based systems. Protective proteins further include human serum albumin 
(Sijmons et al. 1990) and phytase (Verwoerd et al. 1995). 

At least one genetic regulatory element may be included in the cDNA 
encoding the transcription of specific growth factors. Regulatory elements include 
transcription promoters or enhancers that increase the frequency of transcription 
events, leader sequences that increase the stability of mRNA prior to translation, 
and signal peptides that target proteins to specific organelles for posttranslational 
modifications and accumulation. Examples of transcription enhancers include but 
are not limited to the octopine synthase enhancer, a 16 bp palindrome 
(ACGTAAGCGCTTACGT) (Ellis et al. 1987) and the B-domain of the 
cauliflower mosaic virus 35S promoter (Kay et al. 1987). An example of a leader 
sequence includes but is not limited to alfalfa mosaic virus RNA4 leader sequence 
(Jobling and Gehrke 1987). Examples of signal peptides include but are not 
limited to the tobacco PR-S signal peptide (Cornelissen et al. 1986) and the 
phytohemagglutinin signal peptide (Hunt and Chrispeels 1991). 
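
The description of the enhancer element as a 16 bp palindrome can be verified directly: a DNA palindrome reads the same as its own reverse complement. The following is a small sketch with illustrative helper names.

```python
# Translation table for Watson-Crick base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


ENHANCER = "ACGTAAGCGCTTACGT"  # 16 bp element quoted in the text
print(len(ENHANCER), is_palindromic(ENHANCER))  # 16 True
```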

Example 1 

The bacteriophage λEGF116 (ATCC No. 59956) containing the gene 
encoding the full length polypeptide of human kidney pre-pro-EGF was obtained 
from ATCC. Pro-EGF (FIG. 1) is the 1207 amino acid precursor in which hEGF 
is flanked by polypeptide segments of 970 and 184 residues at its NH2- and 
COOH-termini, respectively (Bell et al., 1986). The remainder of the 4.8 kb 
pre-pro-EGF gene encodes native signal peptides at both the NH2- and COOH-termini 
of pro-EGF. The polypeptide contains a transmembrane (TM) binding region that 
facilitates proper cleavage in the endoplasmic reticulum. 

The full length of cDNA was excised with Sma I, Hind III, and Eco RI 
restriction enzymes, as shown in FIG. 2, producing two separate fragments. 
These were sequentially ligated into compatible Sma I and Eco RI sites in 
pBluescript, creating the 7.5 kb plasmid pZD203. After proper orientation was 
confirmed, pre-pro-EGF cDNA was further excised with Xba I and Cla I 
restriction enzymes and ligated into compatible sites located between the CaMV 
35S promoter and T7 transcription terminator of binary vector pGA643, forming 
the 16 kb plasmid pZD204 (FIG. 3). This plasmid was directly transferred into 
Agrobacterium tumefaciens LBA4404 using the freeze-thaw method (An 1987). 
The transferred plasmid was introduced into tobacco whole plants (by leaf disks) 
and calli (by suspension culture) by co-cultivation with the Agrobacterium, thereby 
producing transformants. Over 200 specific samples of transformants were taken 
from the co-cultivation and separately placed on kanamycin selective media. The 
co-cultivated transformants that grew were positive transformants. The positive 
transformants were screened under kanamycin selection pressure, and preliminary 
ELISA results indicated the presence of hEGF in tobacco calli. Accumulation 
levels of hEGF in select transgenic calli are shown on a ng/g fresh weight basis in 
FIG. 4. The bars in FIG. 4 represent a random sample of the specific samples of 
transformants. The highest level of accumulation at approximately 400 ng/(g fresh 
weight cells) (ppb) corresponds to a concentration of 4.1 ng/(mg total soluble 
protein) (ppm) (based on a measured total soluble protein level of approximately 
98 mg/(g fresh weight cells)). The 4.1 ng/(mg total soluble protein) (ppm) 
corresponds to 4100 pg/(mg total soluble protein) (ppb), which is almost two 
orders of magnitude greater than the result of 60 pg/(mg total soluble protein) 
(ppb) reported by Higo et al. (1993). 
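
The unit conversions in this comparison can be reproduced in a few lines. The sketch below uses only the figures quoted in the text (400 ng/g fresh weight, 98 mg total soluble protein per g fresh weight, and the 60 pg/mg upper value attributed to Higo et al.); the variable names are illustrative.

```python
import math

egf_ng_per_g_fw = 400.0   # ng hEGF per g fresh weight (highest callus)
tsp_mg_per_g_fw = 98.0    # mg total soluble protein (TSP) per g fresh weight
higo_pg_per_mg = 60.0     # pg hEGF per mg TSP, upper value reported by Higo et al.

# ng per g fresh weight divided by mg TSP per g fresh weight gives ng per mg TSP.
egf_ng_per_mg_tsp = egf_ng_per_g_fw / tsp_mg_per_g_fw
egf_pg_per_mg_tsp = egf_ng_per_mg_tsp * 1000.0   # 1 ng = 1000 pg

fold_increase = egf_pg_per_mg_tsp / higo_pg_per_mg
orders_of_magnitude = math.log10(fold_increase)

print(round(egf_ng_per_mg_tsp, 1))    # 4.1
print(round(fold_increase))           # 68
print(round(orders_of_magnitude, 2))  # 1.83
```

A 68-fold increase corresponds to about 1.8 decades, consistent with the description "almost two orders of magnitude".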

Further ELISA and Northern blot analyses were used to detect high levels of 
foreign protein production and mRNA transcription, respectively. Western blot 
analysis, completed to determine protein size, showed that specific EGF bearing 
constructs of 30 kDa were produced. This size corresponds to approximately 250 
amino acids. 
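
The residue estimate depends on the average mass assumed per amino acid. The sketch below is an approximation only: 120 Da per residue reproduces the figure of about 250 residues given above, while 110 Da per residue, a commonly used rule of thumb and an assumption not taken from the text, gives a somewhat larger estimate. Either way the product exceeds 200 residues.

```python
def residues_from_mass(mass_da: float, avg_residue_da: float) -> float:
    """Estimate chain length from molecular mass and an average residue mass."""
    return mass_da / avg_residue_da


PROTEIN_MASS_DA = 30_000  # 30 kDa band observed by Western blot (from the text)

print(round(residues_from_mass(PROTEIN_MASS_DA, 120.0)))  # 250
print(round(residues_from_mass(PROTEIN_MASS_DA, 110.0)))  # 273
```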

Closure 

While a preferred embodiment of the present invention has been shown and 
described, it will be apparent to those skilled in the art that many changes and 
modifications may be made without departing from the invention in its broader 
aspects. The appended claims are therefore intended to cover all such changes and 
modifications as fall within the true spirit and scope of the invention. 

GENERAL INFORMATION: 



(i) APPLICANT: Brian S. Hooker, et al 

(ii) TITLE OF INVENTION: Method of Producing Human Growth 
Factors From Whole Plants or Plant Cell Cultures 

(iii) NUMBER OF SEQUENCES: 8 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Paul W. Zimmerman 
(B) STREET: P.O. Box 999 

(C) CITY: Richland 

(D) STATE: WA 

(E) COUNTRY: USA 

(F) ZIP: 99352 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: 3 1/2 Magnetic Disk 

(B) COMPUTER: IBM compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: WORD97 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/747,246 

(B) FILING DATE: 11-12-96 

(C) CLASSIFICATION: unknown 
(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: N/A 

(B) FILING DATE: 
(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Paul W. Zimmerman 

(B) REGISTRATION NUMBER: 34,761 

(C) REFERENCE/DOCKET NUMBER: E-1519 
(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 509-375-2981 

(B) TELEFAX: 509-375-2592 

(C) TELEX: 
(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4431bp 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: double strands 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: Sense orientation of complementary DNA 
for pro-EGF 

(iii) HYPOTHETICAL: 

(iv) ANTI-SENSE: 5'-AGT GAC TCA GTC GAG ... TTC TCA CTC 
GTC-3' end 

(v) FRAGMENT TYPE: 4.5 kb SmaI/HindIII double-stranded DNA 
fragment 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: kidney 

(B) STRAIN: human 

(C) INDIVIDUAL ISOLATE: GI Belle 

(D) DEVELOPMENTAL STAGE: adult 
(E) HAPLOTYPE: 

(F) TISSUE TYPE: 

(G) CELL TYPE: 

(H) CELL LINE: 

(I) ORGANELLE: 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: fetal human liver library 

(B) CLONE: lambda CH4A; lambda EMBL4; lambda GM1416 
(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: 

(B) MAP POSITION: 

(C) UNITS: 
(ix) FEATURE: 

(A) NAME/KEY: human epithelial growth factor cDNA 

(B) LOCATION: 

(C) IDENTIFICATION METHOD: cross-hybridization 

with mouse cDNA 

(D) OTHER INFORMATION: 
(x) PUBLICATION INFORMATION: 

(A) AUTHORS: 
(B) TITLE: 

(C) JOURNAL: 

(D) VOLUME: 

(E) ISSUE: 

(F) PAGES: 

(G) DATE: 

(H) DOCUMENT NUMBER: 

(I) FILING DATE: 

(J) PUBLICATION DATE: 

(K) RELEVANT RESIDUES IN SEQ ID NO: 

FROM (position) TO (position) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



CCCGGGCCAT GCTCCAGCAA AATCAAGCTG TTTTCTTTTG AAAGTTCAAA CTCATCAAGA TT 62 



ATG CTG CTC ACT CTT ATC ATT CTG TTG CCA GTA GTT TCA AAA TTT AGT TTT GTT 116 
AGT CTC TCA GCA CCG CAG CAC TGG AGC TGT CCT GAA GGT ACT CTC GCA GGA AAT 170 
GGG AAT TCT ACT TGT GTG GGT CCT GCA CCC TTC TTA ATT TTC TCC CAT GGA AAT 224 
AGT ATC TTT AGG ATT GAC ACA GAA GGA ACC AAT TAT GAG CAA TTG GTG GTG GAT 278 
GCT GGT GTC TCA GTG ATC ATG GAT TTT CAT TAT AAT GAG AAA AGA ATC TAT TGG 332 
GTG GAT TTA GAA AGA CAA CTT TTG CAA AGA GTT TTT CTG AAT GGG TCA AGG CAA 386 
GAG AGA GTA TGT AAT ATA GAG AAA AAT GTT TCT GGA ATG GCA ATA AAT TGG ATA 440 
AAT GAA GAA GTT ATT TGG TCA AAT CAA CAG GAA GGA ATC ATT ACA GTA ACA GAT 494 
ATG AAA GGA AAT AAT TCC CAC ATT CTT TTA AGT GCT TTA AAA TAT CCT GCA AAT 548 
GTA GCA GTT GAT CCA GTA GAA AGG TTT ATA TTT TGG TCT TCA GAG GTG GCT GGA 602 
AGC CTT TAT AGA GCA GAT CTC GAT GGT GTG GGA GTG AAG GCT CTG TTG GAG ACA 656 
.CA GAG ^J. ATA ACA GCT GTG TCA TTG GAT GTG CTT GAT AAG CGG CTG TTT TGG 7^0 
ATT CAG TAC AAC AGA GAA GGA AGC AAT TCT Cli fti ^^G 818 

G^A GGT TCT GTC CAC ATT AGT AAA CAT CCA ACA CAG CAT 

TCC CTT TTT GGT GAC CGT AiC TTC TAT A^" ^^-^^ Tj;;^ 926 

ATA GCC AAC AAA CAC ACT GGA AAG GAC ATG GTT AGA ATT AA 

TTT GTA CCA CTT GGT GAA CTG AAA GTA GTG CAT CCA Cl^ ^ ^^^^ ^03^ 

m ^AT GAC ACT TGG GAG CCT GAG CAG AAA CTT TGC AAA ^^^^ ^^^^ 

III SOA TAC GCC CtK mi CGA GAC CGG AAG TAC TGT GAA GAT .TT AA. 

GCT TTT TGG AAT CAT GGC TGT ACT CTT GGG TGT AAA AAC ^ 

TAC TGC ACG TGC CCT GTA GGA TTT GTT CTG CTT CUl ^ ^334 
?^ CTT GTT TCC TGT CCA CGC AAT GTG TCT GAA TGC AGC CAi ^^^g 
A?^ 5cA GAA GGT CCC TTA TGT TTC TGT CCT GAA GGC TCA GT ^ 
GGG AAA ACA TGT AGC GGT TGT TCC TCA CCC GAT AAT ^^1 ^j^^, ^^gg 

CCT CTT AGC CCA GTA TCC TGG GAA TGT GAT TGC ill ^ ^^^O 
CTA cll CTG gIt GAA AAA AGC TGT GCA GCT TCA GGA CCA CAA CCA TTT ^^^^ 

iS i ir= - - Afc 1 1 1 - - - - 
S I i fc? i^i III - i i i - - - ^ ^"^^ 

^C^G ^^T ^G Tg? ^rT I g - AAA ATA A^C ACT AAG JS.. 

r-nr RAC ATC TCT CAA CCA CGA GGA ATT GCT GTT CAT CCA Ai^ u ^^^^ 
?TC AC? GA? ACA GGG ATT AAT CCA CGA ATT GAA AG. TCT TCC CTC 

r-TT CGC CGT CTG GTT ATA GCC AGC TCT GAT CTA Ai^ ^ 2060 
aS GAC TTC TTA ACT GAC AAG TTG TAG TGG TGC GAT GCC AAG CAG 

GAA ATG GCC AAT CTG GAT GGT TCA AAA CGC CGA AGA CTT A ^168 

GGT CAC CCA TTT GCT GTA GCA GTG TTT GAG GAT TAT GTG ^ ^222 
GC? ATG CCA TCA GTA ATA AGA GTA AAC AAG AGG ACT GGC AAA GAT 

TTC CAA GGC AGC ATG CTG AAG CCC TCA TCA Lib bii ^^^^ 2330 

SIS CCA GGA GCA GAT CCC TGC TTA TAT CAA AAC GGA GGC TGT ^ ^ ^^^^ 

^ KaG AGG CTT GGA ACT GCT TGG TGT TCG TGT CGT GAA GGl 24 38 
^ Sat GGG AAA ACG TGT CTG GCT CTG GAT GGT CAT CAG CT 

GAA GTT GAT CTA AAG AAC CAA GTA ACA CCA TTo GAC AIL ^54 6 

G?^ TCA GAA GAT AAC ATT ACA GAA TCT CAA CAC ATG CTA bio ^600 

-TG ?CA GAT CAA GAT GAC TGT GCT CCT GTG GGA TGC AGC A.^ i 2654 

ATT TCA GAG GGA GAG GAT GCC ACA TGT CAG TGT TTG AAA G^A ^^^g 

GGA CTA TGT TCT GAT ATA GAT GAA TGT GAG ATG GGT GTC CCA ^^^^ 
CC- ^ TCC TCC AAG TGC ATC AAC ACC GAA GGT GGT TAT GTC 1 cAA CTG 2 816 
G^ GGC TAC CAA GGA GAT GGG ATT CAC TGT CTT GAT A^T GAT 

GTG CAC AGC TGT GGA GAG AAT GCC AGC TGC ACA AAi ^92 4 
aSc ?GC ATG TGT GCT GGA CGC CTG TCT GAA CCA GGA AAT AGT 

CCC CTG TCC CAC GAT GGG TAC TG. CTC CAT GAT bbl 3032 

GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT Gb^ 3O86 

CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC CTG AT q 

CCA CCC CCT CAC CTC AGG GAA GAT GAC CAC CAC TA^ l^b 3^3^ 
CAC GGG CAG CAG CAG AAG GTC ATC GTG GTG GCT GTC TGC GTG 

ATG CTG CTC CTC CTG AGC CTG TGG GGG GCC CAC TAC lAU n 33Q2 

CTA TCG AAA AAC CCA AAG AAT CCT TAT GAG GAG TCG AGC A 33^^ 

CCC AGG CCT GCT GAC ACT GAG GAT GGG ATG TCC TCT TGC CL^ 3^^^^ 

G?G GT? ATA AAA GAA CAC CAA GAC CTC AAG AAT GGG GGT CAA CCA ^ ^^^^ 

GAG GAT GGC CAG GCA GCA GAT GGG TCA ATG CAA CCA ACT .LA 3^^g 

CCC CAG TTA TGT GGA ATG GGC ACA GAG CAA GGC TGC TGG AT| 3^^^ 

GAT Sag GGC TCC TGT CCC CAG GTA ATG GAG CGA AGC TTT CAI Ai 3^26 
GGG ^A CAG ACC CTT GAA GGG GGT GTC GAG AAG CCC CAT TC. 

AAC CCA TTA TGG CAA CAA AGG GCC CTG GAL Lba uo/^ 3586 

CAG TGA 3^^g 

^W^CTGGAAT T.W^GGAAA GTCAAGAAGA AT^^^CTATG TC^AT^C^^^ ^I^^^fc'cTA 3806 

TTTCAA-AAGT AGAGCAAAAC TATAGGT.TT GGTTCCACAA TC ^^^^ gCAGTCTCAC 38 66 

CTCAATGCCT GGAGACAGAT ^CGTAGT^GT ^^TT^TGTTT ^ ^^^^ TTAGAAACCC 392 6 

TGCAGTCTTA TTTCCAAGTA AGAGTACTGG ^AGAATLALi ^ ^^A CAAATAGATT 398 6 

AaA-TGGGAC ;iJ^CAGTGCTT TGTAAATTGT GTTGTCTTCA ^CAb^ aCAGTCACAC 404 6 

TTT-TTTTTG TTGTTCCTGC AGCCCCAGAA GAAATTAGGG ATCAGTTTCA 4106 

TGG??TGGTC AGTTACAAAG TAATTTCTTT GATCTGGACA GAACATTTA. A^^^^^^^^^ ,,ee 

SgGT^C A?rCTSI?GA ?TTfGC^I^l ^T^IS^TG^G ItGAATCAAT GAAAAATGTA 
ATTTAGAAAC TGATTTCTTC AGAATTAGAT GGCCTTATTT TTTAAAATAT TTGAATGAAA 4286 

ACATTTTATT TTTAAAATAT TACACAGGAG GCCTTCGGAG TTTCTTAGTC ATTACTGTCC 4346 

TTTTCCCCTA CAGAATTTTC CCTCTTGGTG TGATTGCACA GAATTTGTAT GTATTTTCAG 4406 

TTACAAGATT GTAAGTAAAT TGCCTGATTT GTTTTCATTA TAGACAACGA TGAATTTCTT 4466 

CTAATTATGA ATTC 4480 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 783bp 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: double strands 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: Sense orientation of five 
copies of mature EGF concatemers 

liirAS^I-S^^S^^^sLcGC^GTC AAG GGT , . . TCT CAG TGA TAA-3 

(v) FRAGMENT TYPE: 4[illegible]kb SmaI/HindIII double strands DNA fragment
(vi) ORIGINAL SOURCE:

(A) ORGANISM: kidney

(B) STRAIN: human

(C) INDIVIDUAL ISOLATE: Z.Dai, et al. 

(D) DEVELOPMENTAL STAGE: adult 

(E) HAPLOTYPE: 

(F) TISSUE TYPE:

(G) CELL TYPE: 

(H) CELL LINE: 

(I) ORGANELLE: 

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: fetal human liver library

(B) CLONE: lambda CH4A; lambda EMBL4; lambda GM1416

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: 

(B) MAP POSITION: 

(C) UNITS: 

(ix) FEATURE:

(A) NAME/KEY: Concatemer of mature EGF fragment without linker

(B) LOCATION:

(C) IDENTIFICATION METHOD: PCR cloning

(D) OTHER INFORMATION: 
(x) PUBLICATION INFORMATION:

(A) AUTHORS:
(B) TITLE:
(C) JOURNAL:
(D) VOLUME:

(E) ISSUE:

(F) PAGES:

(G) DATE:

(H) DOCUMENT NUMBER:

(I) FILING DATE:

(J) PUBLICATION DATE:
(K) RELEVANT RESIDUES IN SEQ ID NO:

FROM (position) TO (position)

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
AAT AGT GAC TCT GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT 54
GTG TGC ATG TAT ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC 108
TAC ATC GGG GAG CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC AAT 162
AGT GAC TCT GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT GTG 216
TGC ATG TAT ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC TAC 270
ATC GGG GAG CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC AAT AGT 324
GAC TCT GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT GTG TGC 378
ATG TAT ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC TAC ATC 432
GGG GAG CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC AAT AGT GAC 486
TCT GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT GTG TGC ATG 540
TAT ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC TAC ATC GGG 594
GAG CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC AAT AGT GAC TCT 648
GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT GTG TGC ATG TAT 702
ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC TAC ATC GGG GAG 756
CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC 795
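
The repeat structure of SEQ ID NO: 2 lends itself to a quick arithmetic check. The following Python sketch is illustrative only and is not part of the original filing: it translates the 159-nucleotide mature EGF unit that repeats in the listing, using the standard genetic code, and counts how many direct repeats are needed to reach the 200-residue threshold recited in the claims. The names `EGF_UNIT`, `translate`, and `min_copies` are assumptions introduced for this sketch.

```python
from itertools import product
from math import ceil

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One mature EGF coding unit as it repeats in SEQ ID NO: 2 (53 codons).
EGF_UNIT = "".join("""
    AAT AGT GAC TCT GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT
    GTG TGC ATG TAT ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC
    TAC ATC GGG GAG CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC
""".split())


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def min_copies(unit_residues: int, target_residues: int = 200) -> int:
    """Smallest number of direct repeats whose total length meets the target."""
    return ceil(target_residues / unit_residues)


unit_protein = translate(EGF_UNIT)
five_copies = translate(EGF_UNIT * 5)

print(len(EGF_UNIT), len(unit_protein))      # nucleotides and residues per unit
print(len(EGF_UNIT * 5), len(five_copies))   # totals for five direct repeats
print(min_copies(len(unit_protein)))         # repeats needed for >= 200 residues
```

Under these assumptions a single unit is 159 nucleotides (53 residues), five linker-free copies give 795 nucleotides (265 residues), and four copies are the minimum that reaches 200 residues.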
(4) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 891bp
(B) TYPE: Nucleic acid

(C) STRANDEDNESS: double strands

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE:

(A) DESCRIPTION: [illegible] orientation concatemer of mature EGF fragments with linkers

(iii) HYPOTHETICAL: taa-3 end 

linked with linkers 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: kidney 

(B) STRAIN: human

(C) INDIVIDUAL ISOLATE: Z.Dai, et al

(D) DEVELOPMENTAL STAGE: adult
(E) HAPLOTYPE:

(F) TISSUE TYPE: 

(G) CELL TYPE: 

(H) CELL LINE: 

(I) ORGANELLE:

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: 

(B) CLONE: 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/ SEGMENT: 

(B) MAP POSITION: 

(C) UNITS: 

(ix) FEATURE:

(A) NAME/KEY: concatemer of mature EGF linked with linkers

(B) LOCATION:

(C) IDENTIFICATION METHOD: PCR cloning

(D) OTHER INFORMATION: Cleavage sites at 142-165, 307-331, 465-489, 531-655.

(x) PUBLICATION INFORMATION:

(A) AUTHORS: 

(B) TITLE: 

(C) JOURNAL: 

(D) VOLUME: 

(E) ISSUE: 
(F) PAGES:

(G) DATE:

(H) DOCUMENT NUMBER:

(I) FILING DATE:

(J) PUBLICATION DATE:

(K) RELEVANT RESIDUES IN SEQ ID NO: 

FROM (position) TO (position) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
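AAT AGT GAC TCT GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT 54
GTG TGC ATG TAT ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC 108
TAC ATC GGG GAG CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC GGC 162
GGA AGA GTT AAC TGC ATG CAG AAT AGT GAC TCT GAA TGT CCC CTG TCC CAC GAT 216
GGG TAC TGC CTC CAT GAT GGT GTG TGC ATG TAT ATT GAA GCA TTG GAC AAG TAT 270
GCA TGC AAC TGT GTT GTT GGC TAC ATC GGG GAG CGA TGT CAG TAC CGA GAC CTG 324
AAG TGG TGG GAA CTG CGC GGC GGA AGA GTT AAC TGC ATG CAG AAT AGT GAC TCT 378
GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT GTG TGC ATG TAT 432
ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC TAC ATC GGG GAG 486
CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA CTG CGC GGC GGA AGA GTT AAC 540
TGC ATG CAG AAT AGT GAC TCT GAA TGT CCC CTG TCC CAC GAT GGG TAC TGC CTC 594
CAT GAT GGT GTG TGC ATG TAT ATT GAA GCA TTG GAC AAG TAT GCA TGC AAC TGT 648
GTT GTT GGC TAC ATC GGG GAG CGA TGT CAG TAC CGA GAC CTG AAG TGG TGG GAA 702
CTG CGC GGC GGA AGA GTT AAC TGC ATG CAG AAT AGT GAC TCT GAA TGT CCC CTG 756
TCC CAC GAT GGG TAC TGC CTC CAT GAT GGT GTG TGC ATG TAT ATT GAA GCA TTG 810
GAC AAG TAT GCA TGC AAC TGT GTT GTT GGC TAC ATC GGG GAG CGA TGT CAG TAC 864
CGA GAC CTG AAG TGG TGG GAA CTG CGC 891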
(5) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 330bp
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double strands
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE:

(A) DESCRIPTION: upstream enhancer (from -343 to 
-90 bp) of 35S promoter 

(iii) HYPOTHETICAL: 

(iv) ANTI-SENSE: 

(v) FRAGMENT TYPE: 253bp upstream of 35S promoter
enhancer element

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: cauliflower mosaic virus (CaMV) 

(B) STRAIN: Cabb B-D 

(C) INDIVIDUAL ISOLATE: Z.Dai, et al 

(D) DEVELOPMENTAL STAGE: 

(E) HAPLOTYPE: 

(F) TISSUE TYPE: 

(G) CELL TYPE: 

(H) CELL LINE:

(I) ORGANELLE:

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: 

(B) CLONE: 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT:

(B) MAP POSITION:

(C) UNITS: 

(ix) FEATURE: 

(A) NAME/KEY: 35S promoter B-domain enhancer 

(B) LOCATION: 

(C) IDENTIFICATION METHOD: standard cloning

(D) OTHER INFORMATION: B-domain of 35S promoter from EcoR V
site to Hind II site (upstream enhancer region from -343 to -90 bp)

(x) PUBLICATION INFORMATION: 

(A) AUTHORS: 

(B) TITLE: 

(C) JOURNAL: 

(D) VOLUME: 

(E) ISSUE: 

(F) PAGES: 

(G) DATE: 

(H) DOCUMENT NUMBER: 

(I) FILING DATE:
(J) PUBLICATION DATE: 

(K) RELEVANT RESIDUES IN SEQ ID NO: 

FROM (position) TO (position) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GTCAACATGG TGGAGCACGA CACACTTGTC TACTCCAAAA ATATCAAAGA TACAGTCTCA 60 

GAAGACCAAA GGGCAATTGA GACTTTTCAA CAAAGGGTAA TATCCGGAAA CCTCCTCGGA 120 

TTCCATTGCC CAGCTATCTG TCACTTTATT GTGAAGATAG TGGAAAAGGA AGGTGGCTCC 180

TACAAATGCC ATCATTGCGA TAAAGGAAAG GCCATCGTTG AAGATGCCTC TGCCGACAGT 240

GGTCCCAAAG ATGGACCCCC ACCCACGAGG AGCATCGTGG AAAAAGAAGA CGTTCCAACC 300 

ACGTCTTCAA AGCAAGTGGA TTGATGTGAT 330 
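
The running counters printed at the end of each sequence line can be verified mechanically. The sketch below is an illustration rather than part of the filing: it re-counts the ten-base groups of SEQ ID NO: 4, reading the third and fourth counters as 180 and 240, and reports any line whose printed counter disagrees with the cumulative base count. The function name `check_counters` is an assumption introduced here.

```python
# Lines of SEQ ID NO: 4: ten-base groups followed by a running counter.
SEQ_ID_NO_4 = """
GTCAACATGG TGGAGCACGA CACACTTGTC TACTCCAAAA ATATCAAAGA TACAGTCTCA 60
GAAGACCAAA GGGCAATTGA GACTTTTCAA CAAAGGGTAA TATCCGGAAA CCTCCTCGGA 120
TTCCATTGCC CAGCTATCTG TCACTTTATT GTGAAGATAG TGGAAAAGGA AGGTGGCTCC 180
TACAAATGCC ATCATTGCGA TAAAGGAAAG GCCATCGTTG AAGATGCCTC TGCCGACAGT 240
GGTCCCAAAG ATGGACCCCC ACCCACGAGG AGCATCGTGG AAAAAGAAGA CGTTCCAACC 300
ACGTCTTCAA AGCAAGTGGA TTGATGTGAT 330
""".strip().splitlines()


def check_counters(lines):
    """Return (total_bases, mismatches); each mismatch is (line_no, counted, printed)."""
    total = 0
    mismatches = []
    for line_no, line in enumerate(lines, start=1):
        *groups, printed = line.split()
        bases = "".join(groups)
        if set(bases) - set("ACGT"):
            raise ValueError(f"line {line_no}: unexpected character in sequence")
        total += len(bases)
        if total != int(printed):
            mismatches.append((line_no, total, int(printed)))
    return total, mismatches


total, mismatches = check_counters(SEQ_ID_NO_4)
print(total, mismatches)
```

The same check can be applied to any other listing in this document that uses ten-base groups with a trailing counter, which is useful for spotting scan damage in the counters or in the bases themselves.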




WO 98/21348

PCT/US97/20603

(6) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1441bp 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: double strands

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE:

(A) DESCRIPTION: 5'-untranscription region of chlorophyll a/b binding protein

(iii) HYPOTHETICAL:

(iv) ANTI-SENSE:

(V) FRAGMENT TYPE: llkb EcoR 1 fragment 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: whole plants 

(B) STRAIN: Arabidopsis 

(C) INDIVIDUAL ISOLATE: Ha et al 

(D) DEVELOPMENTAL STAGE: 30 day old seedlings 

(E) HAPLOTYPE: 

(F) TISSUE TYPE: 

(G) CELL TYPE: 

(H) CELL LINE: 

(I) ORGANELLE:

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: genomic DNA library

(B) CLONE: lambda bATlOOS

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: 

(B) MAP POSITION: 

(C) UNITS: 

(ix) FEATURE:

(A) NAME/KEY: Arabidopsis cab1 gene promoter

(B) LOCATION:

(C) IDENTIFICATION METHOD: cross-hybridization

(D) OTHER INFORMATION:
(x) PUBLICATION INFORMATION:

(A) AUTHORS: 

(B) TITLE: 

(C) JOURNAL: 

(D) VOLUME: 

(E) ISSUE: 

(F) PAGES: 

(G) DATE: 

(H) DOCUMENT NUMBER: 

(I) FILING DATE: 

(J) PUBLICATION DATE:
(K) RELEVANT RESIDUES IN SEQ ID NO:

FROM (position) TO (position) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

[leading sequence lines illegible in source]

AATTTATAAG GAAATGAATA GAGAAATCAA ATCAiibA/^^ rVTTTrrTCT CACATTATAG 4 80 
AGGTCAGGTC TAAGAAAATA TTCCTGAAGC TCAAAAAAGA GTTTTCCTCT CACATT^^^ 

AATTGGCCTT TACTTCAACA T TTCCCACC TATTCCALAl ^ AGTTATATGG 600 

TACTTGTGGA TCAATTTCCG GTTGAAATGG GTTTGGTGAA ^ATCCG^i ^^^^^ ggg 

TGGCCGTTGG AATTGGCTTA TTAGTTGTGG CCGTTGTTGA ^GCCGTT ^^^^^^^^^^ ,20 

GAGAAGCAGA CTTGTGGCTA TGAGTCTATG ^CCA^ACTC ^^^^ TCTAATAAAA 780 

TGACCCTGAC CATCACCTTG ATCiGGTGGA TTCCAATGTT TCTGTGGTAA 840 

TATTATGGTC AATACAGGTG SI^JJ^S^ ^C^T^^^C ATACATAAAT TTTATAGTTT 900 

AGTTTGATTC AATTCCGTAG .TTTAGATAA ^^TTA TTCC TCAGAAGAAG 9 60 

™SS SrcS^A?! T^™S ?T^^T StATTATACA AGGCAATTAT 1020 
CCAAATTTTT TTTGTTTTGG TTTACATTGA TGCTCTCAGG ATTTCATAAG GATAGAGAGA 1080

TCTATTCGTA TACGTGTCAC GTCATGAGTG GGTGTTTCGC CAATCCATGA AACGCACCTA 1140

GATATCTAAA ACACATATCA ATTGCGAATC TGCGAAGTGC GAGCCATTAA CCACGTAAGC 1200

AAACAAACAA TCTAAACCCC AAAAAAAATC TATGACTAGC CAATAGCAAC CTCAGAGATT 1260

GATATTTCAA GATAAGACAG TATTTAGATT TCTGTATTAT ATATAGCGAA AATCGCATCA 1320

ATACCAAACC ACCCATTTCT TGGCTTACAA CAACAAATCT TAAACGTTTT ACTTTGTGCT 1380

GCACTACTCA ACCTTAATGG CCGCCTCAAC AATGGCTCTC TCCTCCCCTG CCTTCGCCGG 1440

T 1441




(7) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 832bp 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: double strands 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE:

(A) DESCRIPTION: CaMV 35S 5'-untranscription
upstream

(iii) HYPOTHETICAL:

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE: AluI (from 7143bp)-EcoRI (to 7517bp)

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: cauliflower mosaic virus 

(B) STRAIN: CM4-184

(C) INDIVIDUAL ISOLATE: RJ Shepherd
(D) DEVELOPMENTAL STAGE:

(E) HAPLOTYPE: 

(F) TISSUE TYPE: 

(G) CELL TYPE: 

(H) CELL LINE: 
(I) ORGANELLE:

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: genomic library of CM4-184

(B) CLONE: pOS-1

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: 

(B) MAP POSITION: 

(C) UNITS: 

(ix) FEATURE: 

(A) NAME/KEY: CaMV 35S promoter 

(B) LOCATION:

(C) IDENTIFICATION METHOD: cross-hybridization

(D) OTHER INFORMATION:
(x) PUBLICATION INFORMATION:

(A) AUTHORS: 

(B) TITLE: 

(C) JOURNAL: 

(D) VOLUME: 
(E) ISSUE:

(F) PAGES: 

(G) DATE: 

(H) DOCUMENT NUMBER: 

(I) FILING DATE: 

(J) PUBLICATION DATE: 

(K) RELEVANT RESIDUES IN SEQ ID NO:

FROM (position) TO (position) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

[positions 1-776 illegible in source]

GTTCATTTCA TTTGGAGAGA ACACGGGGGA CTCTAGAGGA TCCCCGGGTG GTCAGT 832
(8) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 473bp 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: double strands 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: 5'-untranscription upstream of
ribosomal protein L34

(iii) HYPOTHETICAL:

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE: 1500bp BamHI-HindIII

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: tobacco NT1 cells

(B) STRAIN: 

(C) INDIVIDUAL ISOLATE: Z.Dai, et al 

(D) DEVELOPMENTAL STAGE: 3 days old 

(E) HAPLOTYPE: 
(F) TISSUE TYPE:

(G) CELL TYPE:

(H) CELL LINE: NT1

(I) ORGANELLE: 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: genomic library 
(B) CLONE: TSC40

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: 

(B) MAP POSITION: 

(C) UNITS: 

(ix) FEATURE: 

(A) NAME/KEY: RPL-34 promoter 
(B) LOCATION:

(C) IDENTIFICATION METHOD: plaque hybridization

(D) OTHER INFORMATION:
(x) PUBLICATION INFORMATION:

(A) AUTHORS:
(B) TITLE:
(C) JOURNAL:

(D) VOLUME:

(E) ISSUE:
(F) PAGES:

(G) DATE:

(H) DOCUMENT NUMBER:

(I) FILING DATE:

(J) PUBLICATION DATE:

(K) RELEVANT RESIDUES IN SEQ ID NO:

FROM (position) TO (position) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

AGATCTCT CTTTGTATTC TTATTGATGT ACTGGTTTGA AGATGAATAA AATCTTTCAT 58 

TCCACCAAAA AAAGAATGAA AATAAAATTT TAATATACAT GTTGATATAG ACAAAGAAGA 118 

AAAAAAAAGT TGTGATTACA TTTATTGACT ATTTGATGCC AATATCTATA ACTAGAGCTA 178

TTTTCTATCA ATTATATGGG TATGTTGTTA TACCATGCCA AAACCTCAAT TCATAATGTG 238 

CTTGTTTAAA CCCAGTTTAA TGGGCTAACA TGTTGATGGG CTTATAGGCC CGTCTGATTT 298 

CCTTGCCAGA CACTAGTAAG TAAATGATTC TATCATCCAA TATCAACCGT GGGATCTAGG 358 

GCTTGTCCCA CTTATATACA CTACATATAT TTAACTTTCC TTTAGCCCTT CTGCTTCAGC 418 

CCCCAAAACA AAGAAAGAAG CTACAGAGAG AATAGCAGCG CCGCCGTGAA AAATG 473
(9) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1162bp
(B) TYPE: Nucleic acid

(C) STRANDEDNESS: double strands

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE:

(A) DESCRIPTION: 5'-untranscription region of 35S gene from CaMV with [illegible] copies of B domain

(iii) HYPOTHETICAL:

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE: 253bp HindIII/EcoRV fragment, 3[illegible]3bp HindII/EcoRI fragment

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: whole cell 

(B) STRAIN: CM4-184 

(C) INDIVIDUAL ISOLATE: Z.Dai, et al

(D) DEVELOPMENTAL STAGE: 

(E) HAPLOTYPE: 

(F) TISSUE TYPE: 

(G) CELL TYPE: 

(H) CELL LINE: 

(I) ORGANELLE: 

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: genomic library of CM4-184

(B) CLONE: pOS-1
(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT:

(B) MAP POSITION:

(C) UNITS:

(ix) FEATURE:

(A) NAME/KEY: 35S promoter with duplication
of upstream B domain

(B) LOCATION: 

(C) IDENTIFICATION METHOD: 

(D) OTHER INFORMATION: 
(x) PUBLICATION INFORMATION:

(A) AUTHORS:
(B) TITLE:
(C) JOURNAL:
(D) VOLUME:

(E) ISSUE:

(F) PAGES:
(G) DATE:

(H) DOCUMENT NUMBER:
(I) FILING DATE:

(J) PUBLICATION DATE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CCC ACACATCCTT ^GAGAGGCTT ACGCAGCAGG TCT^^ 

CGAGCA/vTAA TCTCCAGGAA ^TCAAATACC ^TCCCAAGAA ^Gl^i aTCAGAAGTA 17 3 

GATTCAGG;- TAACTGCATC AAGAACACAG AGAAAGATAT ^^^^ ATAGAGATTG 233 

CTATTCCAC-T ATGGACGATT CAAGGCTTGC TTCACAAALL. ^2 aTTCAAATAG 2 93 

gIgtctctaa aaaggtagtt cccactgaat caaaggccat ggagtcaaag ^^^^^^^^^^ 333 

AGGACCT;w\C AGAACTCGCC GTAAAGACTG GCGAACAGfi Yl'^^^^^ GTCTACTCCA 413 

tcaatg/.c.v. gaagaaaatc ttcgtcaaca tggtggagca cgacacactt l^^^^^^ ,,3 

AAAATATCAA AGATACAGTC TCAGAAGACC ^^GoCAA^ c?gtcacttt ATTGTGAAGA 533 

TAATATCCC3 AAACCTCCTC GGATTCCAT. GCCCAGCTAT ^TGIU^ aaGGCCATCG 593 

TAGTGG.WA GGAAGGTGGC TCCTACAAA GCCATCATTG ^GA^^ aGGAGCATCG 653 

TTGAAGATG- CTCTGCCGAC AGTGGTCCCA AAGATGGACC ^^^^^^^^t, GATAACATGG 713 

?ggaaaa.-.ga agacgttcca accacgtctt caaagcaagt ggattga.gt ^^^^^^^ 
tggagcacga cacacttgtc I^ctccaaaa ^tatcaaa^ cctcctcgga TTCCATTGCC 83 3 
gggcaattga gacttttcaa caaagggtaa tatccggaaa cct^^^^^^^ tACAAATGCC 893 

^?^^??G:r:^ ^^i^A^G GicAlcfTfc ^G^CTC TGCCGACAGT GGTCCCAAAG 9.3 
ATGGACCCCC ACCCACGAGG AGCATCGTGG AAAAAGAAGA CGTTCCAACC ACGTCTTCAA 1013
AGCAAGTGGA TTGATGTGAT ATCTCCACTG ACGTAAGGGA TGACGCACAA TCCCACTATC 1073
CTTCGCAAGA CCCTTCCTCT ATATAAGGAA GTTCATTTCA TTTGGAGAGA ACACGGGGGA 1133
CTCTAGAGGA TCCCCGGGTG GTCAGT 1159
CLAIMS



We claim:

1. A method of producing human growth factors from plant cells,
comprising the steps of:

(a) obtaining a positive transformant of the plant cells, the positive
transformant carrying genetic material encoding the production of a human growth
factor with a length of at least 200 amino acids;

(b) cultivating the positive transformant; and
(c) obtaining the human growth factors.

2. The method as recited in claim 1, wherein obtaining the positive
transformant has the step of:

modifying a chimeric cDNA encoding the human growth factor with a
length of at least 200 amino acids, for subcloning into a plant expression vector.

3. The method as recited in claim 2, further comprising the steps of:

(a) subcloning the chimeric cDNA into the plant expression vector
and obtaining a subcloned plant expression vector;
(b) transferring the subcloned plant expression vector into a
plurality of plant cells;

(c) selecting a plurality of positive transformants from the plurality of
plant cells on an antibiotic selective media;

(d) permitting growth of the portion of the plurality of plant cells in
whole plants or suspensions; and

(e) extracting a liquid containing the human growth factor from the
plurality of transgenic plant cells.

4. The method as recited in claim 3, wherein transferring is by direct
particle bombardment.

5. The method as recited in claim 3, wherein transferring is by
Agrobacterium mediated transformation.

6. The method as recited in claim 5, wherein Agrobacterium mediated
transformation comprises the steps of:

(a) placing the subcloned plant expression vector into an
Agrobacterium;

(b) co-cultivating the Agrobacterium containing the subcloned
plant expression vector with the plurality of plant cells.

7. The method as recited in claim 1, wherein the step of cultivating is 
with a whole plant. 



8. The method as recited in claim 1, wherein the step of cultivating is
with a plant tissue culture.

9. The method as recited in claim 1, wherein the step of obtaining is
selected from the group consisting of ultrafiltration, affinity chromatography, and
electrophoresis.

10. The method as recited in claim 1, wherein the length of at least 200 
amino acids is obtained by cloning a cDNA. 

11. The method as recited in claim 10, wherein said cDNA is a pre-pro-EGF
cDNA.

12. The method as recited in claim 1, wherein the length of at least 200 
amino acids is obtained by synthesizing a cDNA. 
13. The method as recited in claim 12, wherein said synthesizing is
concatemerizing multiple gene copies to obtain the length of at least 200 amino
acids.

14. The method as recited in claim 1, further comprising increasing an 
overall size of a gene to be expressed with a fusion construct encoding an hEGF 
linked to a protein that is efficiently produced in plant systems. 

15. The method as recited in claim 1, wherein said human growth factor 
is selected from the group consisting of epidermal growth factor (EGF), 
transforming growth factor (TGF), vascular endothelial growth factor (VEGF), 
platelet-derived growth factor (PDGF), fibroblast growth factor (FGF), tumor 
necrosis factor (TNF), heparin-binding epidermal growth factor (HBEGF), insulin-like
growth factor (ILGF), platelet-derived endothelial cell growth factor
(PDECGF), platelet-derived angiogenesis factor (PDAF), and bone-and-cartilage 
inducing growth factor (BCIF) and combinations thereof. 



16. The method as recited in claim 2, wherein modifying is by adding a
regulatory element selected from the group consisting of leader sequences, signal
peptides, transcription promoters or enhancers, and transcription terminators.

17. The method as recited in claim 2, wherein modifying a chimeric
cDNA, comprises the steps of:

(a) adding said transcription promoter to the upstream or 5' end of
the chimeric cDNA; and

(b) adding said transcription terminator to the downstream or 3'
end of the chimeric cDNA.

18. The method as recited in claim 17, further comprising adding an
additional regulatory element encoding a signal peptide, said additional regulatory
element added between the transcription promoter and the upstream 5' end of the
chimeric cDNA.

19. The method as recited in claim 18, further comprising adding a
regulatory element between the transcription promoter and the additional
regulatory element encoding the signal peptide to enhance mRNA stability.

20. The method as recited in claim 18, further comprising adding a
regulatory element at the downstream or 3' end of the chimeric cDNA to enhance
mRNA stability.
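
The element ordering recited in claims 17 through 20 can be summarized as a simple 5'-to-3' layout. The Python sketch below is illustrative only and is not part of the claims; `build_cassette` and all element names passed to it are hypothetical placeholders. It also assumes that the 3' stability element of claim 20 sits between the chimeric cDNA and the transcription terminator, which the claim text does not state explicitly.

```python
from typing import List, Optional, Tuple


def build_cassette(
    promoter: str,
    cdna: str,
    terminator: str,
    signal_peptide: Optional[str] = None,
    five_prime_stability: Optional[str] = None,
    three_prime_stability: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return (role, name) pairs in 5'-to-3' order, skipping absent elements."""
    # Claims 19 and 20 depend on claim 18, so a stability element
    # is only accepted when a signal peptide is also present.
    if signal_peptide is None and (five_prime_stability or three_prime_stability):
        raise ValueError("stability elements require a signal peptide (claim 18)")
    ordered = [
        ("promoter", promoter),
        ("5' mRNA stability element", five_prime_stability),
        ("signal peptide", signal_peptide),
        ("chimeric cDNA", cdna),
        # Assumption: placed upstream of the terminator.
        ("3' mRNA stability element", three_prime_stability),
        ("terminator", terminator),
    ]
    return [(role, name) for role, name in ordered if name is not None]


# Hypothetical element names, used only to show the resulting order.
cassette = build_cassette(
    promoter="modified 35S promoter",
    cdna="pre-pro-EGF cDNA",
    terminator="terminator (placeholder)",
    signal_peptide="signal peptide (placeholder)",
    five_prime_stability="5' element (placeholder)",
    three_prime_stability="3' element (placeholder)",
)
print(" | ".join(role for role, _ in cassette))
```

Omitting the optional arguments yields the minimal promoter, cDNA, terminator arrangement of claim 17.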

21. The method as recited in claim 17, wherein transcription promoters
limit growth factor production to a non-crop portion of a transgenic whole plant.

22. The method as recited in claim 21, wherein the transcription promoters
are selected from the group consisting of an upstream enhancer region (-343 to -90
bp) of a CaMV 35S promoter, a chlorophyll a/b binding promoter (cab1) and
combinations thereof.

23. The method as recited in claim 17, wherein the transcription promoters
are selected from the group consisting of a modified 35S promoter, TSC29
promoter, TSC40 promoter and combinations thereof.

24. The method as recited in claim 23, wherein the modified 35S
promoter is a 35S promoter modified by duplicating an upstream enhancer region
(-343 to -90 bp) of the 35S promoter to increase transcription activity.

25. The method as recited in claim 2, wherein said cDNA is a pre-pro- 
EGF cDNA. 

26. The method as recited in claim 25, wherein said pre-pro-EGF cDNA
has approximately 4.5 kb, whereby overall titers of active hEGF in both whole
plants and cell culture are increased.

27. The method as recited in claim 2, wherein the length of at least 200 
amino acids is obtained by synthesizing the cDNA. 



28. The method as recited in claim 27, wherein said synthesizing is
concatemerizing multiple gene copies to obtain the length of at least 200 amino
acids.

29. The method as recited in claim 28, wherein said multiple gene copies
are an oligomeric polypeptide having repeated hEGF domains.

30. The method as recited in claim 2, further comprising increasing an
overall size of a gene to be expressed with a fusion construct encoding an hEGF
linked to a protein that is efficiently produced in plant systems.